In this section, I follow \citet[Chapter~5]{campbell2017financial} and discuss some 
of the conceptual building blocks for the strand of time-series empirical finance literature.

\subsection{Market Efficiency}\label{chap1:sec1:ssec1}
An intuitive way of explaining \textbf{\textit{market efficiency}} is that efficient markets
are competitive and allow no easy ways to make economic profit. A more useful and testable
definition was given by \citet[p.~127]{malkiel1989efficient}:

\begin{quote}
    The market is said to be efficient with respect to some information set $\phi$,
    if security prices would be unaffected by revealing that information to all participants.
\end{quote}

Some event studies that measure market responses to news announcements can be interpreted
as tests of market efficiency regarding the announced information, but in general, this 
definition is not easy to test. On the other hand, \cite{malkiel1989efficient} gives a more testable alternative:
\begin{quote}
    Efficiency with respect to an information set $\phi$ implies that it is impossible to
    make economic profits by trading on the basis of $\phi$.
\end{quote}

This is the idea behind an enormous literature in empirical asset pricing: if an economic model
defines the equilibrium return as $\Theta_{i,t}$, then the null hypothesis is
\begin{equation}
    R_{i,t+1} = \Theta_{i,t}+U_{i,t+1}
\end{equation}
where $U_{i,t+1}$ is a FAIR game regarding the information set at t, or $\mathbb{E}(U_{i,t+1}|\phi_t)=0$.
Notice that market efficiency is equivalent to rational expectations, one must text a model of
expected returns as well when testing market efficiency. After defining a model of expected returns,
the variables to be included in the information set must be specified. \citet{malkiel1970efficient} define
three forms of efficient market hypothesis and the corresponding information sets:
\begin{enumerate}
    \item[-] the \textbf{\textit{weak form}}: past returns
    \item[-] the \textbf{\textit{semi-strong form}}: publicly available information such as stock splits, dividends, or earnings
    \item[-] \sidenotes{$\leftarrow$ this could be tested by using measureable actions (trades or portfolio holdings) of the potentially better informed agents} the \textbf{\textit{strong form}}: information available to some market participants, but NOT necessarily to all participants.
\end{enumerate}

In the time-series literature, the simplest economic model is constant return: $\Theta_{i,t}=\Theta$. In Section \ref{chap1:sec2}, I summarize
the early literature focusing on this model.

Market efficiency has been widely tested and debated, now the most accepted view of market efficiency hypothesis is that
it is a useful benchmark but does not hold perfectly. The debates between long-term versus short-term efficiency, micro versus
macro efficiency are still and will continue to be heated. Some noticable alternative hypotheses are:
\begin{enumerate}
    \item[-] \textbf{\textit{High-frequency noise}}: market prices are contaminated by short-term noise, which can be caused by measurement errors or illiquidity (bid-ask bounce).
    \item[-] \textbf{\textit{Inperfect information processing}}: the market reacts sluggishly to information after its releasing
    \item[-] \textbf{\textit{Persistent mispricing}}: market prices deviate substantially from efficient levels in a LONG time
    \item[-] \textbf{\textit{Disposition effect}}: individual investors are more willing to sell winning stocks then losing stocks, see \citet{shefrin1985disposition} for details.
\end{enumerate}

\subsection{Model: autocorrelation of returns}\label{chap1:sec1:ssec2}
The most basic time-series test of market efficiency is to test "whether past deviations of returns from model-implied expected returns
predict future return deviations" \citep[See][p.~124]{campbell2017financial}. The leading approach to do so is to test the autocorrelations.

Starting points:
\begin{enumerate}
    \item The null hypothesis $H_0$: the stock returns are i.i.d. 
    \item The standard error for any single sample autocorrelation equals asymptotically $1/\sqrt{T}$, see \citet{box1970distribution} for a detailed discussion.
    \item The standard error would be large, (0.1 if $T=100$), not so easy to detect small autocorrelation
\end{enumerate}
Any autocorrelation test would have to solve these issues.

\subsubsection{Q-statistics}
Because the stock returns are i.i.d. ($H_0$), different autocorrelations are uncorrelated with one another.
\citet{box1970distribution} calculates a sum of $K$ squared sample autocorrelations:
\begin{equation}
    Q_K = T\sum^K_{j=1}\hat{\rho}^2_j
\end{equation}
where $\hat{\rho}_j=\hat{Corr}(r_t,r_{t-j})$. Q is asymptotically distributed $\chi^2$ with K degrees of freedom.

\textbf{Pros:} It solves the problem of the large standard errors.

\textbf{Cons:} It does NOT use the sign of the autocorrelations (squared). What could happen is that the expected reutrns are
not constant, instead, they are each individually small but all have the same sign.

\subsubsection{Variance ratio}
One way to take the sign of autocorrelations into consideration is the variance ratio statistic.
This statistic was introduced to the finance literature by \citet{lo1988stock} and \citet{poterba1988mean}.

\textbf{The basic setting is}: for a holding period $K$, the log return of this entire period $r_t(K)$ is the sum of all the one-period returns $r_{t+i}$:
$$
r_t(K) \equiv r_t + r_{t+1} + \cdots + r_{t+K-1}
$$
and the variance ratio over the period $K$ would be defined as:
$$
    VR(K) = \frac{Var(r_t(K))}{K\cdot Var(r_{t})}
$$
If there are not autocorrelations, then the i.i.d. returns would have identical variance in each period from $t$ to $t+K$,
and $Var(r_t(K)) = Var(r_t+\cdots+r_{t+K-1})=Var(r_t)+\cdots+Var(r_{t+K-1})=K\cdot Var(r_t)$. Thus, $VR(K)=1$. If we rewrite
the definition of the variance ratio as:
\begin{equation}\label{eq1.3}
    VR(K) = \frac{Var(r_t(K))}{K\cdot Var(r_{t})} = 1+\underbrace{2\sum^{K-1}_{j=1}\left(1-\frac{j}{K}\right)\hat{\rho}_j}_{\text{weighted average of the first $K-1$ sample autocorrelations}}
\end{equation}

Then by comparing $VR(K)$ with 1, we can deduct the direction of the autocorrelations:
\begin{center}
    \begin{tabular}{rl}
    \hline
    $VR(K) > 1$ & predominantly \textbf{positive} autocorrelations  \\ 
    $VR(K) = 1$ & no autocorrelations \\ 
    $VR(K) < 1$ & predominantly \textbf{negative} autocorrelations: mean reversion \\
    \hline
    \end{tabular}
\end{center}
Notice that the weight term $1-\frac{j}{K}$ increases as $j$ approaches $K$\footnote{\citet{cochrane1988big} showed that the estimator of VR(K) can be 
interpreted in terms of the frequency domain. It is asymptotically equivalent to $2\pi$ times the normalized spectral density
estimator at the zero frequency, which uses the Bartlett kernel.}.

The asymptotic variance of the variance-ratio statistic, under $H_0$ (i.i.d. returns), is:
\begin{equation}
    Var(\hat{VR}(K))=\frac{4}{T}\sum^{K-1}_{j=1}\left(1-\frac{j}{K}\right)^2=\frac{2(2K-1)(K-1)}{3KT} \xrightarrow{K\rightarrow \infty} \frac{4K}{3T}
\end{equation}

When $K\rightarrow\infty,T\rightarrow\infty,$ and $K/T\rightarrow 0$ \citep[p.~463]{priestley1981spectral}, 
the true return process can be serially correlated and heteroskedastic, but the variance of the variance-ratio
is still given as:
\begin{equation}
    Var(\hat{VR}(K)) = \frac{4K}{3T}\cdot VR(K)^2
\end{equation}
Notice that this can be quite large with a large $VR(K)$. This is due to the fact that $K/T\rightarrow0$ is a dangerous 
assumption because in practice $K$ is often large relative to the sample size. To tackle this, \citet{lo1988stock} develop
alternative asymptotics assuming $K/T\rightarrow \delta$ where $\delta>0$. Through Monte Carlo simulations, they demonstrated
that this new distribution is a more robust approximation to the small-sample distribution of the VR statistic. Most current applications
of the VR statistic cite $K/T \rightarrow \delta >0$ as the justification for using Monte Carlo distributions (i.e. set at $K=\delta T$) as
representative of the VR statistic's sampling distribution. Some recent challenges of this result are discussed in Section \ref{chap1:sec1:ssec3}.

To accommodate $r_t$'s exhibiting conditional heteroskedasticity, \citet{lo1988stock} proposed a heteroskedasticity-robust
variance estimation of VR(K) as:
$$
Var^*(\hat{VR}(K)) = 4\sum^{K-1}_{j=1}\left(1-\frac{j}{K}\right)^2 \cdot \frac{\sum^T_{t=j+1}(r_t-\bar{r})^2(r_{t-j}-\bar{r})^2}{\left[\sum^T_{t=1}(r_t-\bar{r})^2\right]^2}
$$
where $\bar{r}=\frac{1}{T}\sum^{T}_{t=1}r_t$ is the estimated mean of returns.

\subsubsection{Regression approach}
\citet{fama1988permanent} established a regression approach to test AR(K). The basic idea is to regress the
K-period return on the lagged K-period return:
$$
    r_t(K) = \alpha_K + \beta_K r_{t-K}(K)+\epsilon_t^K
$$
The coefficient $\beta_K$ would then be:
\begin{equation}
    \beta_K = \frac{Cov[r_t(K),r_{t-K}(K)]}{Var[r_{t-K}(K)]} = 2\left[\frac{VR(2K)}{VR(K)}-1\right]=\frac{2\sum^{K-1}_{j=1}\left(\frac{\min (j,2K-j)}{K}\right)\rho_j}{VR(K)}
\end{equation}
It is clear to see that:
\begin{center}
    \begin{tabular}{rl}
    \hline
    $\beta_K > 0$ & predominantly \textbf{positive} autocorrelations  \\ 
    $\beta_K = 0$ & no autocorrelations \\ 
    $\beta_K < 0$ & predominantly \textbf{negative} autocorrelations: mean reversion \\
    \hline
    \end{tabular}
\end{center}

\subsection{Extension: Other variance-ratio tests}\label{chap1:sec1:ssec3}
As summarized by \citet{charles2009variance}, the intuition behind the VR test is rather simple, but conducting a statistical inference using the VR test is less
straightforward. In this bonus subsection, I briefly summarize some recent development of individual VR tests, multiple VR tests and bootstrapping VR tests.
For more detailed discussion, see \citet{charles2009variance} for a review.

\subsubsection{Individual VR tests}
Conventional VR tests, such as the \citeauthor{lo1988stock} test, are asymptotic tests: their sampling distributions are approximated by their limiting distributions.
In practice, the asymptotic theory provides a poor approximation to the small-sample distribution of the VR statistic, which impeded the use of the statistic.
In general, the ability of the asymptotic distribution to approximate the finite-sample distribution depends crucially on the value of $K$. For a large $K$ relative to T,
\citet{lo1990contrarian} have proved that the VR statistics are severely biased and right skewed. Several alternative tests try to tackle this issue.

\textbf{\citet{chen2006variance} Test}: they suggested a simple power transformtaion of the VR statistic when $K$ is NOT too large. This transformation is able to solve 
the right-skewness problem and robust to conditional heteroskedasticity. They showed that the transformed VR statistic leads to significant gain in power against mean reverting 
alternative. They define the VR statistic based on the periodogram and this new statistic is precisely the normalized discrete periodogram average estimate of the spectral
density of a stationary process at the origin.

\textbf{\citet{wright2000alternative} Test}: they proposed a non-parametric alternative using signs and ranks. This test outperforms the \citeauthor{lo1988stock} test in 2 ways:
\begin{enumerate}
    \item[(1)] As the rank and sign tests have an exact sampling distribution, there is no need to resort to asymptotic distribution approximation.
    \item[(2)] The tests may be more powerful against a wide range of models displaying serial correlation, including fractionally integrated alternatives.
\end{enumerate}
The rank-based tests display low-size distortions under conditional heteroskedasticity. One thing to notice is that the sign test assumes a zero drift value,
\citet{luger2003exact} extended this test with unknown drift.

\textbf{\citet{choi1999testing} Test}: To overcome the issue that the arbitrary and \textit{ad hoc} choice of $K$, \citet{choi1999testing} proposed a data-dependent procedure to
determine the optimal value of $K$. This test is based on frequency domain following \citet{cochrane1988big}. However, instead of using Bartlett kernel as \citeauthor{cochrane1988big},
\citeauthor{choi1999testing} employed the quadratic spectral kernel since it's optimal in estimating the spectral density at the zero frequency.

\subsubsection{Multiple VR tests}
All the tests above are individual tests, where the null hypothesis is tested for an individual value of $K$. However, to determine whether a time series is a random walk, we need to rule
out all possibilities, meaning that for all values of $K$, the null hypothesis can not be rejected. It is necessary to conduct a joint test where a multiple comparison of VRs over a set of
different time horizons is made. However, conducting separate individual tests for a number of $K$ values may lead to over rejection of the null hypothesis. Several tests have been developed 
for this problem, with the joint null hypothesis $H_0: \forall K_i, V(K_i)=1$ against the alternative $H_1: \exists K_i, V(K_i)\neq 1$

\textbf{\citet{chow1993simple} Test}: This test statistic is defined as 
$$MVR_1=\sqrt{T}\max_{1\leq i \leq m}|M_1(K_i)|$$
where $M_1(K_i) = \frac{VR(K_i)-1}{\sqrt{2(2K-1)(K-1)/3KT}}$. This is based on the idea that the decision regarding the null hypothesis can be obtained from the maximum absolute value of
the individual VR statistics. Then, they applied the Sidak probability inequality and give an upper bound to the critical values taken in the studentized maximum modulus (SMM) distribution.
The statistic follows the SMM distribution with $m$ and $T$ degrees of freedom, where $m$ is the number of $K$ values. To accommodate heteroskedasticity, one can change $M_1(K_i)$ into 
a heteroskedasticity-robust individual VR test.

\textbf{\cite{whang2003multiple} Test}: They use a subsampling technique to develop a multiple VR test. When sample size ($T$) is relatively small, this test outperforms the conventional VR
tests, and shows little to no serious size distortions. The statistic is:
$$
MVR_T = \sqrt{T}\max_{1\leq i \leq m}|VR(K_i)-1|
$$
and the sampling distribution function for the $MVR_T$ statistic is asymptotically a maximum of a multivariate normal vector with an unknown covariance matrix, which would be complicated to
estimate. Therefore, they proposed to approximate the null distribution by means of the subsampling approach.
For a subsample of size $b$: $(x_t,\cdots,x_{t-b+1})$ where $t=1,\cdots,T-b+1$. The statistic $MVR$s calculated from all individual subsamples would generate a $(1-\alpha)$th percentile for the
$100(1-\alpha)\%$ critical value. To implement this subsampling technique, a choice of block length $b$ must be made. \citet{whang2003multiple} recommended the interval of 
$(2.5T^{0.3},3.5T^{0.6})$, but they also found that the size and power properties of their test are not sensitive to $b$.

\textbf{\cite{belaire2004ranks} Test}: This is a multiple rank and sign VR tests, an extension to the \citeauthor{wright2000alternative}'s rank- and sign-based tests. The test is based on 
the definition of \citet{chow1993simple} procedure. The rank-based procedures are exact under the i.i.d. assumption whereas the sign-based procedures are exact under both the i.i.d. and
martingale difference sequence assumption. They showed that rank-based tests are more powerful than their sign-based counterparts.

\textbf{\citet[Wald-Type Test]{richardson1991tests}}: They suggested a joint test based on the following Wald statistic:
$$
MVR_{RS}(K) = T(\mathbf{VR}-\mathbf{1})'\mathbf{\Phi}^{-1}(\mathbf{VR}-\mathbf{1})
$$
where $\mathbf{VR}$ is the $K\times 1$ vector of sample $K$ VRs, $\mathbf{1}$ is the $K\times 1$ unit vector; $\Phi$ is the covariance matrix of $\mathbf{VR}$. This statistic $MVR_{RS}(K)$ follows
a $\chi^2$ distribution with $K$ degrees of freedom. One thing to remember about this test is that the VR tests are computed over Long lags with overlapping observations, the distribution 
of the VR test is NON-normal.

\textbf{\citet[Wald-Type Test]{cecchetti1994variance}}: They also developed a Wald-type multiple VR statistic that incorporates the correlations between VR statistics at various horizon and weights
them according to their variances:
$$
MVR_{CL}(K) = [\mathbf{VR}(K)-\mathbb{E}[\mathbf{VR(K)}]]'\mathbf{\Psi}^{-1}(K)[\mathbf{VR}(K)-\mathbb{E}[\mathbf{VR(K)}]]
$$
Again, $\mathbf{VR}(K)$ is a vector of VR statistics and $\mathbf{\Psi}$ is a measure of the covariance matrix of $\mathbf{VR}$; and again, this statistic follows a $\chi^2$ distribution with $K$ degrees
of freedom. However, after using Monte Carlo techniques to study the empirical distribution of $MVR_{CL}(K)$, they have found that it has large positive skewness, not $\chi^2$.

\textbf{\citet[Wald-Type Test]{chen2006variance}}: They proposed a joint VR test based on their individual power transformed VR statistic, also following a $\chi^2$ distribution with $K$ degrees of freedom.
One feature sets the \citeauthor{chen2006variance} test apart from the \citeauthor{richardson1991tests} test and the \citeauthor{cecchetti1994variance} test: this test is with ONE-sided alternative (i.e., 
$H_1: \exists K_i$ s.t. $VR(K_i)<1$).

\subsubsection{Bootstrapping VR tests}
Instead of using the subsampling method, some researchers proposed to employ a bootstrap method, which is distribution-free and can be used to estimate the sampling distribution of the VR statistic when 
the distribution of the original population is unknown.

\textbf{\citet{kim2006wild} Test}: \citet{kim2006wild} applied the wild bootstrap to the \citeauthor{lo1988stock} test and the \citeauthor{chow1993simple} test in 3 stages:
\begin{enumerate}
    \item[(1)] From a bootstrap sample of $T$ observations $X_t^{*}=\eta_tX_t$, where $\eta_t$ is a random sequence with $E(\eta)=0, E(\eta^2)=1$ and $t=1,\cdots,T$
    \item[(2)] For the bootstrap sample generated in (1), calculate $VR^{*}=VR(X^{*},K_i)$
    \item[(3)] Repeat (1) and (2) for a sufficient amount of times $m$, to form a bootstrap distribution of the test statistic $\left\{VR(X^{*},K_j; j)\right\}^m_{j=1}$
\end{enumerate}
Conditional on $X_t$, $X_t^*$ is a serially uncorrelated sequence with zero mean and variance $X_t^2$. Thus, wild bootstrapping approximates the sampling distributions under the null hypothesis, which is
a desirable property for a bootstrap test. To perform the test, a specific form of $\eta_t$ should be chosen. \citet{kim2006wild} recommended using the standard normal distribution for $\eta_t$.

\textbf{\citet{malliaropulos1999mean} Test}: They used a weighted bootstrap method proposed by \citet{wu1986jackknife}, which is heteroskedasticity-robust and done by resampling normalized returns
instead of actual returns. The bootstrap scheme can be summarized in 4 steps:
\begin{enumerate}
    \item[(1)] For each $t$, draw a weighting factor $z_t^*$ with replacement from the empirical distribution of normalized returns $z_t = (r_t-\bar{r})/\sigma(r)$, where $t=1,\cdots,T$, $\bar(r)=E(r)$, $\sigma(r)=Var(r)$.
    \item[(2)] Form the bootstrap sample of $T$ observations $\tilde{r}^*_t=z_t^*r_t$
    \item[(3)] Calculate the VR statistic $VR^*(K)$ from the sample $r_t^*$
    \item[(4)] Repeat steps (1) and (2) M times, obtaining $VR^*(K;m)_{1\leq m \leq M}$ 
\end{enumerate}
Using this procedure, resampling from normalized returns, the weighted bootstrap method accounts for the possible non-constancy of the variance of returns. This weighted bootstrap scheme is designed to 
overcome the difficulty that resampling methods may generate data that are less dependent than the original data. One thing to notice is that \citeauthor{malliaropulos1999mean}'s method is not asymptotically
pivotal and not supported by any asymptotic theory or Monte Carlo evidence to evaluate its properties, unlike \citeauthor{kim2006wild}'s method.